
    On Lasso refitting strategies

    A well-known drawback of $\ell_1$-penalized estimators is the systematic shrinkage of large coefficients towards zero. A simple remedy is to treat the Lasso as a model-selection procedure and to perform a second refitting step on the selected support. In this work we formalize the notion of refitting and provide oracle bounds for arbitrary refitting procedures of the Lasso solution. One of the most widely used refitting techniques, based on Least-Squares, may raise an interpretability issue, since the signs of the refitted estimator can be flipped with respect to the original estimator. This problem arises because Least-Squares refitting uses only the support of the Lasso solution, discarding all information about signs and amplitudes. To address this, we define a sign-consistent refitting as an arbitrary refitting procedure that preserves the signs of the first-step Lasso solution, and we provide oracle inequalities for such estimators. We then consider two special refitting strategies: Bregman Lasso and Boosted Lasso. Bregman Lasso has the useful property of converging to the Sign-Least-Squares refitting (Least-Squares with sign constraints), which yields greater interpretability. We additionally study Bregman Lasso refitting in the case of orthogonal design, which provides simple intuition behind the proposed method. Boosted Lasso, in contrast, uses information about the magnitudes of the first Lasso step and allows better oracle rates for prediction. Finally, we conduct an extensive numerical study to show the advantages of one approach over the others in different synthetic and semi-real scenarios.
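
    As a concrete illustration of the refitting step, the sketch below (in Python, using scikit-learn and SciPy; the function name and the choice of alpha are ours, not the paper's) fits a Lasso, then refits by Least-Squares on the selected support, optionally with the sign constraints that define the Sign-Least-Squares refitting. The Bregman and Boosted Lasso strategies discussed above are not reproduced here.

```python
# Minimal sketch of Least-Squares and Sign-Least-Squares refitting of a Lasso
# solution (illustrative only; the Bregman / Boosted Lasso variants are omitted).
import numpy as np
from sklearn.linear_model import Lasso
from scipy.optimize import lsq_linear

def refit_lasso(X, y, alpha=0.1, sign_consistent=True):
    lasso = Lasso(alpha=alpha).fit(X, y)
    beta = lasso.coef_
    support = np.flatnonzero(beta)               # coordinates selected by the Lasso
    if support.size == 0:
        return beta
    Xs = X[:, support]
    if sign_consistent:
        # Sign-Least-Squares: least squares constrained to keep the Lasso signs.
        lo = np.where(beta[support] > 0, 0.0, -np.inf)
        hi = np.where(beta[support] > 0, np.inf, 0.0)
        coef_s = lsq_linear(Xs, y, bounds=(lo, hi)).x
    else:
        # Plain Least-Squares refitting: signs may flip w.r.t. the Lasso solution.
        coef_s, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    refit = np.zeros_like(beta)
    refit[support] = coef_s
    return refit
```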

    Minimax semi-supervised confidence sets for multi-class classification

    In this work we study the semi-supervised framework of confidence-set classification with controlled expected size in minimax settings. We obtain semi-supervised minimax rates of convergence under the margin assumption and a Hölder condition on the regression function. Moreover, we show that if no further assumptions are made, no supervised method outperforms the semi-supervised estimator proposed in this work. We establish that the best achievable rate for any supervised method is $n^{-1/2}$, even if the margin assumption is extremely favorable. On the contrary, semi-supervised estimators can achieve faster rates of convergence provided that sufficiently many unlabeled samples are available. We additionally perform a numerical evaluation of the proposed algorithms, empirically confirming our theoretical findings.
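
    A minimal sketch of the plug-in idea behind confidence-set classification is given below: class scores are estimated on labeled data, and the unlabeled sample is used to calibrate a threshold that controls the expected set size. The function names, the logistic-regression score model, and the grid search are illustrative assumptions, not the paper's estimator.

```python
# Plug-in confidence sets with expected-size control calibrated on unlabeled data
# (illustrative sketch; names and the score model are our assumptions).
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_confidence_sets(X_lab, y_lab, X_unlab, target_size=1.5):
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    scores = clf.predict_proba(X_unlab)                  # estimated class posteriors
    # Pick the smallest threshold t whose average set size on unlabeled data is
    # at most target_size (sets as large as possible under expected-size control).
    grid = np.linspace(0.0, 1.0, 1001)
    sizes = np.array([(scores >= t).sum(axis=1).mean() for t in grid])
    t_hat = grid[np.argmax(sizes <= target_size)]
    predict_set = lambda X: clf.predict_proba(X) >= t_hat   # boolean mask over classes
    return predict_set, t_hat
```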

    Small Total-Cost Constraints in Contextual Bandits with Knapsacks, with Application to Fairness

    We consider contextual bandit problems with knapsacks (CBwK), a setting where at each round a scalar reward is obtained and vector-valued costs are suffered. The learner aims to maximize the cumulative rewards while ensuring that the cumulative costs are lower than some predetermined cost constraints. We assume that contexts come from a continuous set, that costs can be signed, and that the expected reward and cost functions, while unknown, may be uniformly estimated -- a typical assumption in the literature. In this setting, total cost constraints had so far to be at least of order $T^{3/4}$, where $T$ is the number of rounds, and were even typically assumed to depend linearly on $T$. We are, however, motivated to use CBwK to impose a fairness constraint of equalized average costs between groups: the budget associated with the corresponding cost constraints should be as close as possible to the natural deviations, of order $\sqrt{T}$. To that end, we introduce a dual strategy based on projected-gradient-descent updates that is able to deal with total-cost constraints of the order of $\sqrt{T}$ up to poly-logarithmic terms. This strategy is more direct and simpler than existing strategies in the literature. It relies on a careful, adaptive tuning of the step size.
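
    The sketch below illustrates the general shape of such a dual strategy: a primal step that plays the action maximizing an estimated Lagrangian, and a projected-gradient dual update driven by the budget violation. The estimators, the fixed step size, and the cap lam_max are placeholders; the paper's contribution lies precisely in the careful adaptive tuning that this sketch omits.

```python
# Schematic dual strategy for CBwK with projected-gradient-descent updates on the
# dual variable (hedged sketch; step size and lam_max are placeholder choices).
import numpy as np

def cbwk_dual(T, actions, get_context, play, r_hat, c_hat, budget_per_round,
              step=0.1, lam_max=10.0):
    """get_context() yields a context; play(x, a) returns (reward, cost_vector);
    r_hat(x, a) and c_hat(x, a) are uniform estimates of expected reward / costs."""
    budget_per_round = np.asarray(budget_per_round, dtype=float)
    lam = np.zeros_like(budget_per_round)                # one dual variable per cost
    total_reward = 0.0
    for _ in range(T):
        x = get_context()
        # Primal step: play the action maximizing the estimated Lagrangian.
        a = max(actions,
                key=lambda a: r_hat(x, a) - lam @ (c_hat(x, a) - budget_per_round))
        reward, cost = play(x, a)
        total_reward += reward
        # Dual step: projected gradient ascent on the budget violation.
        lam = np.clip(lam + step * (np.asarray(cost) - budget_per_round), 0.0, lam_max)
    return total_reward, lam
```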

    Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm

    This work studies minimization problems with zero-order noisy oracle information under the assumption that the objective function is highly smooth and possibly satisfies additional properties. We consider two kinds of zero-order projected gradient descent algorithms, which differ in the form of the gradient estimator. The first algorithm uses a gradient estimator based on randomization over the $\ell_2$ sphere due to Bach and Perchet (2016). We present an improved analysis of this algorithm on the class of highly smooth and strongly convex functions studied in the prior work, and we derive rates of convergence for two more general classes of non-convex functions. Namely, we consider highly smooth functions satisfying the Polyak-Łojasiewicz condition and the class of highly smooth functions with no additional property. The second algorithm is based on randomization over the $\ell_1$ sphere, and it extends to the highly smooth setting the algorithm that was recently proposed for Lipschitz convex functions in Akhavan et al. (2022). We show that, in the case of a noiseless oracle, this novel algorithm enjoys better bounds on bias and variance than the $\ell_2$ randomization and the commonly used Gaussian randomization algorithms, while in the noisy case both $\ell_1$ and $\ell_2$ algorithms benefit from similar improved theoretical guarantees. The improvements are achieved thanks to new proof techniques based on Poincaré-type inequalities for uniform distributions on the $\ell_1$ or $\ell_2$ spheres. The results are established under weak (almost adversarial) assumptions on the noise. Moreover, we provide minimax lower bounds proving optimality or near-optimality of the obtained upper bounds in several cases.
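
    For concreteness, here is a minimal zero-order projected gradient descent with the classical two-point estimator randomized over the $\ell_2$ sphere. The higher-order smoothing kernel that the paper uses to exploit high smoothness, as well as the $\ell_1$ variant, are omitted, and the parameter values are arbitrary.

```python
# Zero-order projected gradient descent with a two-point l2-sphere estimator
# (simplified sketch; the paper's kernelized estimator for high smoothness is omitted).
import numpy as np

def zero_order_gd(f, x0, n_iter=1000, lr=0.1, h=1e-2, radius=None, seed=None):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    d = x.size
    for _ in range(n_iter):
        zeta = rng.standard_normal(d)
        zeta /= np.linalg.norm(zeta)                     # uniform on the l2 sphere
        # Two-point estimator of the gradient of a smoothed surrogate of f.
        g = (d / (2 * h)) * (f(x + h * zeta) - f(x - h * zeta)) * zeta
        x -= lr * g
        if radius is not None and np.linalg.norm(x) > radius:
            x *= radius / np.linalg.norm(x)              # projection onto the l2 ball
    return x

# Example: minimize a noisy quadratic using only function evaluations.
x_min = zero_order_gd(lambda x: np.sum(x**2) + 1e-3 * np.random.randn(), np.ones(5))
```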

    Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification

    We study the problem of fair binary classification using the notion of Equal Opportunity. It requires the true positive rate to be equal across the sensitive groups. Within this setting, we show that the fair optimal classifier is obtained by recalibrating the Bayes classifier by a group-dependent threshold. We provide a constructive expression for the threshold. This result motivates us to devise a plug-in classification procedure based on both unlabeled and labeled datasets. While the latter is used to learn the output conditional probability, the former is used for calibration. The overall procedure can be computed in polynomial time and is shown to be statistically consistent both in terms of the classification error and the fairness measure. Finally, we present numerical experiments which indicate that our method is often superior to or competitive with the state-of-the-art methods on benchmark datasets.
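
    The sketch below illustrates the plug-in recipe in Python: the conditional probability is learned on labeled data, and group-dependent thresholds are calibrated on unlabeled data so that an eta-weighted proxy of the true positive rate matches a common target. The target level, the logistic model, and the grid are our illustrative choices, not the paper's constructive threshold.

```python
# Group-dependent thresholds for Equal Opportunity, calibrated on unlabeled data
# (hedged sketch; target_tpr and the score model are our assumptions).
import numpy as np
from sklearn.linear_model import LogisticRegression

def equal_opportunity_thresholds(X_lab, y_lab, X_unlab, s_unlab, target_tpr=0.8):
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    eta = clf.predict_proba(X_unlab)[:, 1]               # estimated P(Y = 1 | X)
    grid = np.linspace(0.0, 1.0, 501)
    thresholds = {}
    for s in np.unique(s_unlab):
        e = eta[s_unlab == s]
        # TPR of the classifier 1{eta >= t} within group s, with the unobserved
        # labels replaced by eta:  E[eta 1{eta >= t} | S = s] / E[eta | S = s].
        tpr = np.array([(e * (e >= t)).sum() / e.sum() for t in grid])
        thresholds[s] = grid[np.argmin(np.abs(tpr - target_tpr))]
    # Predict 1 for a point x in group s iff clf.predict_proba(x)[:, 1] >= thresholds[s].
    return clf, thresholds
```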

    Fair Regression via Plug-in Estimator and Recalibration With Statistical Guarantees

    We study the problem of learning an optimal regression function subject to a fairness constraint. It requires that, conditionally on the sensitive feature, the distribution of the function output remains the same. This constraint naturally extends the notion of demographic parity, often used in classification, to the regression setting. We tackle this problem by leveraging a proxy-discretized version, for which we derive an explicit expression of the optimal fair predictor. This result naturally suggests a two-stage approach, in which we first estimate the (unconstrained) regression function from a set of labeled data and then recalibrate it with another set of unlabeled data. The recalibration step can be efficiently performed via smooth optimization. We derive rates of convergence of the proposed estimator to the optimal fair predictor, both in terms of the risk and the fairness constraint. Finally, we present numerical experiments illustrating that the proposed method is often superior or competitive with state-of-the-art methods.
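
    As an illustration of the two-stage idea, the sketch below fits an unconstrained regressor on labeled data and then recalibrates it on unlabeled data by mapping each group's prediction distribution to a common one-dimensional Wasserstein barycenter via quantile averaging. This quantile-mapping recalibration is a stand-in for the paper's discretized proxy and smooth optimization, not the authors' exact procedure.

```python
# Two-stage fair regression sketch: fit on labeled data, recalibrate group-wise
# output distributions on unlabeled data via quantile averaging (illustrative only).
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_fair_regressor(X_lab, y_lab, X_unlab, s_unlab, n_q=100):
    reg = LinearRegression().fit(X_lab, y_lab)
    preds = reg.predict(X_unlab)
    qs = np.linspace(0.0, 1.0, n_q)
    groups, counts = np.unique(s_unlab, return_counts=True)
    w = counts / counts.sum()
    # Group-wise quantile functions of the unconstrained predictions.
    group_q = {s: np.quantile(preds[s_unlab == s], qs) for s in groups}
    # Common target: the weighted average of group quantile functions, i.e. the
    # 1-D Wasserstein barycenter of the group-wise output distributions.
    barycenter = sum(w_s * group_q[s] for w_s, s in zip(w, groups))
    def predict(X, s):
        p = reg.predict(X)
        # Map each prediction to its within-group quantile level, then read off
        # the barycenter at that level.
        u = np.interp(p, group_q[s], qs)
        return np.interp(u, qs, barycenter)
    return predict
```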